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We present a method for estimating the probabilities of outcomes of a quantum circuit using Monte 
Carlo sampling techniques applied to a quasiprobability representation. Our estimate converges to 
the true quantum probability at a rate determined by the total negativity in the circuit, using a 
measure of negativity based on the 1-norm of the quasiprobability. If the negativity grows at most 
polynomially in the size of the circuit, our estimator converges efficiently. These results highlight 
the role of negativity as a measure of non-classical resources in quantum computation. 


Estimating the probability of a measurement outcome 
in a quantum process using only classical methods is 
a longstanding problem that remains of acute interest 
today. Directly calculating such probabilities using the 
Born rule is inherently inefficient in the size of the quan¬ 
tum system, and efficiently estimating such probabilities 
for a generic quantum process is expected to be out of 
reach of classical computers. 

Nonetheless, there are interesting and nontrivial 
classes of quantum circuits for which we can efficiently 
estimate the probabilities of outcomes. The canonical 
example of such a class is that of stabilizer circuits. Such 
circuits can create highly-entangled states and perform 
many of the fundamental operations involved in quantum 
computing (teleportation, quantum error correction, dis¬ 
tillation of magic states) but the celebrated Gottesman- 
Knill theorem allows such circuits to be classically sim¬ 
ulated efficiently [T]. Other examples include fermionic 
linear optics/matchgates [213], and some classes of quan¬ 
tum optics HE). While these methods may be extended 
to include bounded numbers of operations outside of the 
class (for example. Ref. [T]), such extensions generally 
treat all operations outside of the class on an equal foot¬ 
ing (for example, the cost of adding noisy magic states 
is the same as adding pure magic states) and so do not 
provide any insight into the relative resources of different 
operations. 

In this Letter, we present a general method for esti¬ 
mating outcome probabilities for quantum circuits us¬ 
ing quasiprobability representations. Simulation meth¬ 
ods based on quasiprobability representations have a long 
history in physics |7|, and have recently been used in 
quantum computation to identify classes of operations 
that are efficiently simulatable HHin]. Our method al¬ 
lows for estimation in circuits wherein the quasiprobabil¬ 
ities may go negative. That is, while making the most 
efficient use of circuit elements that are represented non- 
negatively, it nonetheless provides an unbiased estima¬ 
tor of the true quantum outcome probability regardless 
of the inclusion of more general elements that are nega¬ 
tively represented. We quantify the performance of this 


method by providing an upper bound on the rate of con¬ 
vergence of this estimator that scales with a measure of 
the total amount of negativity in the circuit. 

Probability estimation. —Consider quantum circuits of 
the following form. The circuit initiates with N qudits 
(d-level quantum systems) in a product state, evolves 
through a circuit consisting of L = poly{N) elementary 
gates that act nontrivially on at most a fixed number of 
qudits (for example, 1- and 2-qudit gates), and termi¬ 
nates with a product measurement, i.e., an independent 
measurement of each qudit. Universal quantum compu¬ 
tation can be achieved with circuits of this form. Note 
that we do not include circuits with intermediate mea¬ 
surements and conditional operations based on their out¬ 
puts (we return to this consideration in the discussion). 

We aim to estimate the probability of a fixed outcome 
d = (oi,...,ojv) where Oj denotes the outcome of the 
measurement on the jth qudit. (Note that estimation of 
the probability of a fixed outcome is distinct from a sim¬ 
ulation as in Refs. [8119], wherein different outcomes are 
sampled from this distribution.) A natural benchmark 
for the precision of an estimator is the precision that can 
be obtained from sampling the quantum circuit itself. If 
we had access to a quantum computer that implemented 
a circuit in this class, then we could use it to estimate 
the probability of a fixed outcome by computing the ob¬ 
served frequency /s(d) of outcome o over s samples. By 
the Hoeffding inequality, /s(d) will be within e of the 
quantum probability p{o} with probability 1 — J provided 
the number of samples s(e, J) satisfies 

s(e,J) > 2^1og(2/(5). (1) 

This bound implies that for any fixed 6, the number 
of samples required to achieve e error scales polynomi¬ 
ally in 1/e. We call estimators satisfying this property 
poly-precision estimators. (We distinguish these from 
exponential-precision estimators, defined as estimators 
for which s(e, <5) scales logarithmically in 1/e.) 

Our central results are a classical algorithm that pro¬ 
duces a poly precision estimate of a quantum circuit in 
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the above class, and a bound on the efficiency of this al¬ 
gorithm based on a measure of the circuit’s negativity in 
a quasiprobability representation. 

Quasiprobability representations .—A quasiprobability 
representation of a qudit over A is defined [m [m by 
a frame {F{X) : A G A} and a dual frame {G(A) : 
A G A}, which are (generally over-complete) bases for 
the space of Hermitian operators acting on satisfying 
A = X)aga ^('^)Tr[AF(A)] for all A. The space A can 
be continuous or discrete, and although many quasiprob- 
ability representations assume a phase space (symplec- 
tic) structure on A, this is not necessary. We can de¬ 
fine quasiprobability distributions on A associated with 
a quantum state p, a unitary operator U and a measure¬ 
ment effect E to be 

Wp{\) = Tr[F(A)p], 

W, 7 (A'|A) = Tr(F(A')C/G(A)f/1'), 

W{E\X) = Tr[AG(A)]. (2) 

Tensor products of these dual frames gives a dual frame 
for the product space, and so these definitions extend 
in the obvious way from a tensor product of N 

qudits to distributions on a phase space A^. 

The distribution Wp{\) is real-valued and satisfies 
Saga much like a probability distribution, if 

the frame is normalized using SagA^('^) ~ Similarly, 
the distributions VT( 7 (A'|A) and W{E\X) are normalized 
like corresponding conditional probabilities. The Born 
rule Pr(if|p, U) = Tv{EUpW), which gives the quantum 
probability for a measurement outcome given the state 
and process, is reproduced in the quasiprobability repre¬ 
sentation as would be expected in a probabilistic theory, 
by 

Pr(ii;|p,;7)= ^ W{E\X')Wu{X'\X)Wp{X). (3) 

A.A'gA 


Analogously, we define the negativity Af _e of a measure¬ 
ment effect E to be 

= E (5) 

aga 

and the point-negativity M.u{X) and negativity Afof a 
unitary U to be 

Mu{X) = V \Wu{X'\X)\ , Mu = rnaxMuiX) , (6) 

^' AgA 

A'gA 

respectively. The negativities for states, unitaries and ef¬ 
fects are lower-bounded by 1, 1 and Tr(£’) respectively, 
with equality if and only if the quasiprobability repre¬ 
sentation is nonnegative. These negativities will serve 
as a measure of the cost of each circuit element in our 
estimator. 

Estimation procedure. —A quasiprobability representa¬ 
tion provides an interpretation of the Born rule as the 
expectation value of a stochastic process. Specifically, 
if the quasiprobability representation of all elements in 
the circuit are nonnegative, then one may interpret the 
Born rule of Eq. (§ as the expected probability of the 
measurement outcome averaged over a set of trajectories 
through phase space mis]- This stochastic interpreta¬ 
tion no longer holds if the quasi-probability representa¬ 
tion for any of the input states, gates, or measurements 
is negative. A standard perspective is that, in a quan¬ 
tum description, different trajectories in phase space can 
be assigned negative weights and can interfere with each 
other m- Monte Carlo sampling techniques may still 
be used, but the key problem is to identify an appropri¬ 
ate distribution to sample trajectories A = (Aq, ..., Xu) 
through phase space, where Aq is associated with the 
preparation and A; and A;_i are associated with the /th 
unitary. Eq. (|^ becomes Pt{E\p,U) = X)Abh(A) with 


This equation follows from the Born rule using the defi¬ 
nition of the dual frames. 

Importantly, the distributions of a quasiprobability 
representation will generally take on negative values, and 
so cannot be directly interpreted as probability distribu¬ 
tions. The 1-norm of a quasiprobability distribution pro¬ 
vides a natural measure of the amount of negativity, i.e., 
how much it deviates from a true probability distribu¬ 
tion. We define the negativity Alp of a state p as the 
1 -norm of its quasiprobability representation, 

Mp = \\Wp\U = J2\Wp{X)\ . (4) 

AgA 

(The mana of a state using the discrete Wigner repre¬ 
sentation was introduced in Ref. m as a measure to 
bound the resources required for magic state distillation, 
and defined as the logarithm of the negativity used here.) 


W{X) = W{E\XL)[UtiWu{Xi\Xi.Q]Wp{Xo). (7) 


Using an approach reminiscent of quantum Monte Carlo 
methods for fermion systems, we could sample trajec¬ 
tories from a true (nonnegative) probability distribution 
obtained from the absolute value of the quasiprobabil¬ 
ity, keeping track of the sign of the sampled trajectory. 
Consider the distribution of trajectories given by 


Pr(A = A) 


I^A)! 

Me 


( 8 ) 


where Ale = l^(•^)l nieasures the negativity of the 

entire circuit, and we regard A as a realization of a ran¬ 
dom variable A. An estimate based on a single real¬ 
ization A is given by ( 71 (A) = Ale Sign[lU(A)], where 
Sign)-] = ±1 depending on the sign of the input. The 
expected value of this estimate gives the desired Born 
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rule probability 


{qi{A)) = ^< 7 i(A)Pr(A = A) = ^ Sign[lP(A)]|Vb(A)| 

A A 

= Y,W{X) = Pr{E\p,U). (9) 


Note that this estimator minimizes the range of A uni, 
and so provides the best bound on the number of samples 
required to obtain a fixed precision using the Hoeffding 


inequality. Sampling from the distribution (13) is also the 


optimal estimator over the space of trajectories in that it 
has the smallest variance (see Appendix). Unfortunately, 
without any additional structure, this estimator will in 
general be impractical for two reasons: there is no known 
efficient method to compute Adc, and sampling from the 


distribution (13) will in general be inefficient in N. 


To develop an efficient procedure, we sample trajecto¬ 
ries A following a Markov chain, using (true) probabilities 
and conditional probabilities at each timestep. Consider 
an input product state p = ®n=iPni which has an ef¬ 
ficient description Wp{X) in the quasi-probability repre¬ 
sentation which may be negative. We sample the initial 
point of the trajectory Aq from the modified distribution 
Pr(Ao) = \Wp{Xq)\ /M.p. We construct a full trajectory 
A by sampling A; at each timestep I = 1,..., L in the 
circuit from the conditional distribution Pr(Ai|Ai_i) = 
\WuiiXi\Xi-i)\ /Mut{Xi-i} given by the unitary gate Ui. 
If the unitary Ui at each timestep I of the circuit has an 
efficient description in the quasi-probability representa¬ 
tion, for example it consists of quantum gates acting on 
a fixed number of systems, then this distribution can be 
sampled efficiently. We note that this efficiency comes 
at a cost, as trajectories are no longer sampled from the 


optimal distribution (13). 


An estimate pi based on a single trajectory A of our 
Markov chain protocol is given by 


Pi (A) = AlpSign[Wp(Ao)] 


xn 


L 

1^1 


Mu, (Az-ijSignjlUi/, (Ai|Ai_i)] 


W{E\Xl). 

( 10 ) 


Unlike q, this estimate is guaranteed to be an efficiently 
computable function of the sampled path A. We note 
that Pi can lie outside the unit interval, but nonetheless 
gives an unbiased estimate of the Born rule probability, 
(pi(A)) = Pt{E\p,U), precisely as in Eq. 0- Further, 
we note that pi lies in the interval [—M^, +M^], where 
we have defined M^ to be the total forward negativity 
hound of the circuit: 


= Adpllf^i-^c/i maxAi, |W(A|Al)| . (11) 


Let Ps be the average of pi over s independent samples of 
A. Using the boundedness and unbiasedness properties 
of Pi , the Hoeffding inequality yields an upper bound on 


the rate of convergence of the average Ps ■ Specihcally, ps 
will be within e of the quantum probability Pv{E\p^ U) 
with probability 1 — d if a total of 


s{e,S) = ^Ad!, ln(2/(5) 


( 12 ) 


samples are taken. Consequently, if the total for¬ 
ward negativity bound Ad_> grows at most polynomi- 
ally with N, then our protocol gives an efficient esti¬ 
mate Ps of the quantum probability Pt{E\p, U) to within 
e = l/poly{N), with an exponentially small failure prob¬ 
ability. That is, for circuits with a polynomially-bounded 
total forward negativity bound, Ps is a poly-precision es¬ 
timator of the Born rule probability and we can sam¬ 
ple Ps efficiently in N. We note that the total for¬ 
ward negativity bound Ad_>. of © is insensitive to the 
measurement negativity Me, instead depending only on 
maxAi |1 U(£;|Al)|. 

Any efficiently computable symmetry of the Born rule 
can be used to give a variant on the procedure defined 
above. The rate of convergence of the estimator need not 
be symmetric under these Born rule symmetries, and so 
such a variant may provide an advantage. Two examples 
of such symmetries - the time reversal symmetry that 
exchanges states and measurement effects in a unitary 
circuit, and the regrouping of unitaries into different ele¬ 
mentary gates - are explored in the Appendix. In partic¬ 
ular, a variant procedure is presented for which the total 
negativity bound is insensitive to the negativity of the 
initial state Mp. 

Example: Estimation with the discrete Wigner 

function. —The odd-d qudit stabilizer subtheory and the 
associated discrete Wigner function provide a canonical 
example for demonstrating the use of our algorithm; see 
also Ref. m- Using this discrete Wigner representa¬ 
tion for our estimation algorithm, the nonnegativity of 
the stabilizer subtheory [n US] ensures that stabilizer 
states, gates, and rank-1 measurements have negativity 
Mp^/u,/E^ = 1 and so are “free” resources. Moreover, 
due to the existence of nonnegatively represented opera¬ 
tions that are not in the stabilizer polytope [5], our ap¬ 
proach is efficient on a strictly larger set of circuits than 
those of Ref. [T] . Circuits with operations possessing neg¬ 
ativity strictly greater than 1, such as magic states and 
non-Clifford gates, can still be estimated but now come 
at a cost. Provided the total negativity bound grows at 
most polynomially in N, our protocol provides an effi¬ 
cient estimator. 

As an example, consider a circuit with an input state 
given by a product state of k qutrit magic states :^(|0) -I- 
C|l) + C®|2)), with f = exp(27ri/9), together with stabi¬ 
lizer |0) states in a 100-qutrit random Clifford circuit, 
and estimate the probability of measuring |0) on the first 
qutrit of the output. The total forward negativity bound 
of this circuit scales exponentially in k and consequently 
the number of samples required to guarantee a hxed pre¬ 


cision scales exponentially in k by Eq. (12). The results 
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FIG. 1: (Color online) Plot of the difference Ps — ip) between 
the estimated probability and the true probability of the out¬ 
come |0){0| for the first qutrit as a function of the number 
of magic states k. Each data point represents a random 100- 
qutrit Clifford circuit with the non-magic states initialized to 
the |0) state. The number of samples s{k) was chosen us¬ 
ing Eq. ( |12[ ) with target precision e = 0.01 (indicated by the 
solid line) with 95% confidence (<5 = 0.05), so the number of 
samples increases exponentially with k (color scale). 

of our numerical simulations, shown in Fig. indicate 
that our estimator does indeed converge with an appro¬ 
priately chosen number of samples. Moreover, while the 
true precision of Ps in our simulations is often orders of 
magnitude better than the target precision, there are cir¬ 
cuits that come close to saturating the target precision, 
suggesting that our bound cannot be substantially im¬ 
proved without further detailed knowledge of the circuit. 

Discussion .—Our results highlight the role of the total 
negativity of a circuit as a resource required for a quan¬ 
tum computer to outperform any classical computer. In 
particular, any circuit element that is represented non- 
negatively does not contribute to the total negativity 
bound and can be viewed as a “free” resource within 
the algorithm. Other circuit elements have an associated 
cost quantified by their negativity, unless they appear at 
the final timestep of the algorithm. This latter observa¬ 
tion motivates us to exploit the time-reversal and other 
symmetries of the Born rule, seeking to minimize the to¬ 
tal forward (or reverse) negativity bound. In particular, 
one could seek equivalent circuits wherein negative op¬ 
erations can be replaced with nonnegative ones by using 
negative initial states or measurements via gate telepor¬ 
tation. By choosing the forward or reverse procedure as 
appropriate, the efficiency can be made insensitive to the 
negativity of these initial states or measurements. 

It also motivates us to identify quasiprobability rep¬ 
resentations in which many of the circuit elements of 
interest are represented nonnegatively. Interesting and 
relevant examples abound, beyond the well-studied qu- 
dit discrete Wigner function. Discrete Wigner functions 


for qubits (d = 2) can be defined for which all stabi¬ 
lizer states with real coefficients (rebits) and all CSS- 
preserving unitaries are nonnegatively represented [16) . 
The range of quasiprobability representations introduced 
in Ref. [T7] represent discrete subgroups of U{2) on a 
single qubit nonnegatively, but have no nonnegative en¬ 
tangling gates; as such representations can represent cer¬ 
tain non-stabilizer single-qubit states nonnegatively, they 
may be useful for estimation in circuits for gate synthesis. 
There is also flexibility in how a quasiprobability repre¬ 
sentation is defined. For example, a given quasiproba¬ 
bility representation can be modified to describe more 
states nonnegatively at the expense of a decreasing set of 
nonnegative measurements, and vice versa, by exploiting 
the structure of the dual frames. Overcomplete frames 
provides freedom in the choice of dual frame, and the neg¬ 
ativity of unitaries and measurements will depend on this 
choice. As the dual frames formalism itself captures the 
relationship between quantum states and measurements, 
there is also freedom in the definition of the quasiprob¬ 
ability representation of unitaries beyond that given by 
Eq. §. Finally, again using the freedom in the choice 
of dual for overcomplete frames, it is possible to switch 
between frames throughout a single circuit. These free¬ 
doms can be used to minimize the total negativity bound 
of the circuit, allowing more efficient estimators. 

Our procedure can be applied to infinite-dimensional 
Hilbert spaces using any of the range of quasiprobabil¬ 
ity representations with continuous phase spaces devel¬ 
oped in the study of quantum optics, by performing an 
appropriate discretization as in Ref. [5]. In this case, 
the negativity of distributions is quantified by integrat¬ 
ing the absolute value of the distributions over the phase 
space, and is directly related to the volume of negativ¬ 
ity [18]. We note that the resulting estimator can be 
applied to quantum optics experiments including states 
and measurements with negative Wigner function, such 
as photon number Fock states, and so may provide addi¬ 
tional insight into the classical simulation cost of boson 
sampling m- While there exist means to efficiently es¬ 
timate the outcome probability of a specific linear optics 
circuit with Fock state input and measurement |5], our 
estimation procedure extends these results by providing 
a general method for estimating outcome probabilities of 
such linear optical circuits for any input and output to¬ 
gether with a bound on the efficiency of this estimation 
based on the volume of negativity of these states. In 
addition, our estimation can easily incorporate squeez¬ 
ing, as well as the loss and noise mechanisms common to 
linear optics experiments. 

There are two natural ways to extend our results to 
circuits that include intermediate measurements and con¬ 
ditional operations based on them. First, one could re¬ 
place the measurement and conditional operation with a 
coherently-controlled operation, and delay the measure¬ 
ment to the end. We note that such controlled operations 
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can be negative, even if the measurement and classically- 
controlled operation are both nonnegative. Second, our 
algorithm can be used to directly estimate the proba¬ 
bilities of the intermediate measurements and to sample 
from them. In this case, the required precision is expo¬ 
nential in the number of intermediate measurements in 
order to calculate conditional probabilities for subsequent 
use in the algorithm. Thus, in general, both approaches 
require resources that are exponential in the number of 
intermediate measurements. 

Finally, our estimation procedure provides insight into 
the study of operationally meaningful measures of non- 
classical resources in quantum computation. Negativity 
in a quasiprobability representation has long been used 
as an indicator of quantum behaviour, but only recently 
has it been quantified as a resource for quantum com¬ 
putation m- Our results provide a related operational 
meaning of this resource: as a measure that bounds the 
efficiency of a classical estimation of probabilities. 
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Proof of optimality 


In this section of the Supplementary Information, we 
prove that sampling the distribution 


Pr(A = A) 


I^A)! 

Me 


(13) 


given in Eq. (8) is optimal in that it gives the minimum 
variance of the Born rule estimator of all distributions. 

Our goal is to estimate P = Pr(£’|p, U) = W{\). 

We will do so by sampling A from some distribution p(A), 
and compute an estimator given by IE(A)/p(A). The 
variance V of this estimator is given by 


^ = E 


[wjxW 

pW 


- 


(14) 


To minimize the variance, we choose p(A) to minimize 
the first term in this expression. 

Theorem 1. The distribution p{X) that minimizes the 
variance is p{X) oc |IT(A)|. 

Proof. The proof follows directly from the Cauchy- 
Schwarz inequality. Consider 



The inequality is saturated forp(A) oc |IT(A)|, and there¬ 
fore this distribution minimizes the variance V. □ 


With this choice, the variance of the estimator is 



where Me = I^('^)I' 


Exploiting symmetries of Born rule 


general) different estimators for this Born rule probabil¬ 
ity. The rate of convergence of these estimator need not 
be the same under this symmetry, and so such a variant 
may provide an advantage. 

As an example, consider the ‘time reversal’ symme¬ 
try that exchanges states and measurement effects in a 
unitary circuit (with some care taken to appropriately 
normalise the distributions for states and effects). One 
can define a “reverse protocol” which produces a poly 
precision estimator p(,, provided that the total reverse 
negativity of the circuit 

M^ = ME]\]f=iMjj]TaayiXo\Wp{\Q)\ , (17) 


is polynomially bounded. In general, M^ ^ M^, as 
seen from 


M^ _ Me maxAo |lTp(Ao)| 

M^ Mp maxA^ |1 T(£;|Al)| ’ 


(18) 


Because both M^ and M^ are efficiently computable, 
one is free to choose the direction of simulation re¬ 
sulting in the faster estimator convergence rate. (We 
note that Me > ^(£1) while Mp > 1, which sug¬ 
gests that the reverse protocol would have slower con¬ 
vergence when using a high rank effect. However, in 
such cases, max^^, |1 T(E|Al)| is in general larger than 
maxAo |lEp(Ao)| by a similar factor, cancelling the effect 
of Tr(i?) in the ratio (18).) 


Another symmetry of the Born rule is the the regroup¬ 
ing of unitaries into different ‘elementary’ gates, such as 
reexpressing U = UlUl-i ■ ■ ■ Ui as U = ■ ■ ■ U[. 

Different groupings can lead to different estimators, as 
we demonstrate with a simple example using a grouping 
of two unitaries into one, U = U 2 U 1 . We can estimate 
p = Tv{UpWE) by sampling trajectories A = (Aq, Ai, A 2 ) 
using Ui and C/ 2 , or directly by sampling X' = (Ao,A 2 ) 
using U = U 2 U 1 as a single step. While both of these 
methods will produce an unbiased estimator of the Born 
rule, they will not converge at the same rate in general, 
as a result of the general inequality 


In this section of the Supplementary Information, we 
detail ways to exploit an efficiently computable symmetry 
of the Born rule to give a variant of our estimation algo¬ 
rithm. If we replace the quantum circuit and measure¬ 
ment effect with another such that the Born rule proba¬ 
bility remains the same, then Eq. (10) provides two (in 


E 

AiGA 


\Wu,{\2\Xi)\ |lEc/,(Ai|Ao)| 

Mu2{.Xi) MuA^o) 


, \Wu{X2\Xo)\ 
MuiXo) 


(19) 

We note that equality holds in the case where Ui and U 2 
are both nonnegative. 

















